Pandas - Introduction
In Python, pandas is one of the most popular libraries for data analysis and data manipulation. It helps you work with structured data such as tables, spreadsheets, or database query results. Because of its tools for loading, cleaning, and summarizing data, pandas is widely used in data science, machine learning, statistics, and research. It is built on top of NumPy, which provides fast numerical operations.
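Main purpose: to store and work with tabular data in a DataFrame, a table of rows and named columns. The following example creates a DataFrame from a dictionary: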
import pandas as pd
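# Each dictionary key becomes a column; each list holds that column's values.
data = {
    "Name": ["Ram", "Sita", "Hari"],
    "Age": [20, 22, 21],
}
# Build the DataFrame and print it as a table.
df = pd.DataFrame(data)
print(df)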
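The introduction notes that pandas is built on top of NumPy. As a minimal sketch of what that means in practice, the example below pulls the Age column out as a NumPy array and uses vectorized arithmetic on it. The column name AgeNextYear is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ram", "Sita", "Hari"],
    "Age": [20, 22, 21],
})

# A single column is a Series; to_numpy() exposes its values as a NumPy array.
ages = df["Age"].to_numpy()
print(type(ages))  # <class 'numpy.ndarray'>

# NumPy functions work on the array directly.
mean_age = np.mean(ages)
print(mean_age)  # 21.0

# Arithmetic on a column is applied to every row at once (no explicit loop).
# "AgeNextYear" is a hypothetical column added only for this illustration.
df["AgeNextYear"] = df["Age"] + 1
print(df)
```

Operating on whole columns like this is usually faster and shorter than looping over rows in plain Python.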
Some basic functions and attributes of the DataFrame object:
import pandas as pd
df = pd.read_csv("data.csv")
# Shows the first 5 rows of the dataset (5 is the default).
print(df.head())
# Shows the last 5 rows of the dataset.
print(df.tail())
# Prints a summary of the dataset: column names, data types, and non-null counts (which reveal missing values).
df.info()
# Provides a statistical summary (count, mean, min, max, etc.) of the numerical columns.
print(df.describe())
# Shows the number of rows and columns as a tuple: (rows, columns).
print(df.shape)
# Shows the column names of the dataset.
print(df.columns)
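# Sorts the rows by a column (ascending by default) and returns a new DataFrame.
print(df.sort_values("Age"))
# Removes rows or columns; axis=1 means a column. Returns a new DataFrame and leaves df unchanged.
print(df.drop("Age", axis=1))
# Groups rows by a column and computes the mean of each group.
# numeric_only=True skips text columns, which can otherwise raise an error in recent pandas versions.
print(df.groupby("Department").mean(numeric_only=True))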
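The snippet above reads from data.csv, which is not included here. As a rough sketch, the same functions can be tried on a small in-memory DataFrame. The sample rows and the Department values below are invented for illustration and are not the contents of the original file. Passing numeric_only=True to mean() is a choice made here so that the text column Name is skipped.

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv.
df = pd.DataFrame({
    "Name": ["Ram", "Sita", "Hari", "Gita"],
    "Age": [20, 22, 21, 24],
    "Department": ["IT", "HR", "IT", "HR"],
})

# shape is an attribute (no parentheses): a (rows, columns) tuple.
rows, cols = df.shape

# sort_values() and drop() return new DataFrames; df itself is left unchanged.
by_age = df.sort_values("Age")
without_age = df.drop("Age", axis=1)

# Average age per department; numeric_only=True ignores the Name column.
mean_age = df.groupby("Department").mean(numeric_only=True)

print(rows, cols)
print(by_age["Name"].tolist())
print(list(without_age.columns))
print(mean_age)
```

To keep a sorted or reduced table, assign the result to a variable as shown, since the original df still contains every column in its original order.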